Abstract

Computation plays a major role in decision making. Even if 
an agent is willing to ascribe a probability to all states and 
a utility to all outcomes, and maximize expected utility, do- 
ing so might present serious computational problems. More- 
over, computing the outcome of a given act might be diffi- 
cult. In a companion paper we develop a framework for game 
theory with costly computation, where the objects of choice 
are Turing machines. Here we apply that framework to de- 
cision theory. We show how well-known phenomena like 
first-impression-matters biases (i.e., people tend to put more 
weight on evidence they hear early on), belief polarization 
(two people with different prior beliefs, hearing the same ev- 
idence, can end up with diametrically opposed conclusions), 
and the status quo bias (people are much more likely to stick 
with what they already have) can be easily captured in that 
framework. Finally, we use the framework to define some 
new notions: value of computational information (a compu- 
tational variant of value of information) and computational 
value of conversation. 

1 Introduction 

Computation plays a major role in decision making. Even if 
an agent is willing to ascribe a probability to all states and 
a utility to all outcomes, and maximize expected utility
(that is, to follow the standard prescription of rationality
as recommended by Savage [1954]), doing so might present
serious computational problems. Computing the relevant
probabilities might be difficult, as might computing the
relevant utilities. Work on Bayesian networks [Pearl 1988]
and other representations of probability, and related
work on representing utilities [Bacchus and Grove 1995;
Boutilier, Brafman, Domshlak, Hoos, and Poole 2004], can
be viewed as attempts to ameliorate these computational
problems. Our focus is on the complexity of computing the
outcome of an act in a given state. Consider the following
simple example, taken from [Halpern and Pass 2010].
Suppose that a decision maker (DM) is given an input n,
and is asked whether it is prime. The DM gets a payoff of
$1,000 if he gives the correct answer and loses $1,000 if he
gives the wrong answer. However, he also has the option
of playing safe, and saying "pass", in which case he gets a
payoff of $1. Clearly, many DMs would say "pass" on all but
simple inputs, where the answer is obvious, although what
counts as a "simple" input may depend on the DM.¹

In [Halpern and Pass 2010], we introduced a model of
game theory with costly computation. Here we apply that 
framework to decision theory. We assume that the DM can 
be viewed as choosing an algorithm (i.e., a Turing machine); 
with each Turing machine (TM) M and input, we associate 
its complexity. The complexity can represent, for example, 
the running time of M on that input, the space used, the 
complexity of M (e.g., how many states it has), or the diffi- 
culty of finding M (some algorithms are more obvious than 
others). We deliberately keep the complexity function ab- 
stract, to allow for the possibility of representing a number 
of different intuitions. The DM's utility can then depend, 
not just on the payoff, but on the complexity. 

The DM's goal is to choose the "best" TM; the one that 
will give him the greatest expected utility, taking both the 
payoff and complexity into account. To make this choice, 
the DM must have beliefs about the TM's running time 
and the "goodness" of the TM's output. For example, if 
the TM outputs "prime" on some input n, then the DM must
have beliefs about how likely n is to actually be prime. As
this example suggests, we actually need here to deal with
what philosophers have called "impossible" possible worlds
[Hintikka 1975; Rantala 1982]. If n is a prime, then this
is a mathematical fact; there can be no state where n is not 
prime; nevertheless, since we want to allow for DMs that are 
resource-bounded and cannot compute whether n is prime, 
we want it to be possible for the DM to believe that n is 
not prime. Similarly, if the complexity function is supposed 
to measure running time, then the actual running time of a 
TM M on input t is a fact of mathematics; nevertheless, we
want to allow the DM to have false beliefs about M's running
time. We capture such false beliefs by having both the
utility function and the complexity function depend on the
state of nature.

1 While primality testing is now known to be in polynomial
time [Agrawal, Kayal, and Saxena 2004], and there are
computationally efficient randomized algorithms that give
the correct answer with extremely high probability
[Rabin 1980; Solovay and Strassen 1977], we can assume that
the DM has no access to a computer.

As we show here, using these simple ideas leads 
to quite a powerful framework. For example, many 
concerns expressed by the emerging field of behavioral
economics (pioneered by Kahneman and Tversky
[Kahneman, Slovic, and Tversky 1982]) can be accounted
for by simple assumptions about players' cost of
computation. To illustrate this point, we show that
first-impression-matters biases [Rabin 1998], that is, that
people tend to put more weight on evidence they hear
early on, can be easily captured using computational
assumptions. We can similarly explain belief polarization
[Lord, Ross, and Lepper 1979]: two people, hearing
the same information (but with possibly different prior
beliefs), can end up with diametrically opposed
conclusions. Finally, we can also use the framework to
formalize one of the intuitions for the well-known status quo
bias [Samuelson and Zeckhauser 1998]: people are much more
likely to stick with what they already have.

As a final application, we use the framework to define a 
new notion: value of computational information. To explain 
it, we first recall value of information, a standard notion in 
decision analysis. Value of information is meant to be a mea- 
sure of how much a DM should be willing to pay to receive 
new information. The idea is that, before receiving the infor- 
mation, the DM has a probability on a set of relevant events 
and chooses the action that maximizes his expected utility, 
given that probability. If he receives new information, he can 
update his probabilities (by conditioning on the information) 
and again choose the action that maximizes expected utility. 
The difference between the expected utility before and after 
receiving the information is the value of the information. 

In many cases, a DM seems to be receiving valuable infor- 
mation that is not about what seem to be the relevant events. 
This means that we cannot do a value of information calculation,
at least not in the obvious way. For example, suppose
that the DM is interested in learning a secret, which we as- 
sume for simplicity is a number between 1 and 1000. A 
priori, suppose that the DM takes each number to be equally 
likely, and so has probability .001. Learning the secret has 
utility, say, $1,000,000; not learning it has utility 0. The 
number is locked in a safe, whose combination is a 40-digit
binary number. What is the value to the DM of learning the
first 20 digits of the combination? As far as value of infor- 
mation goes, it seems that the value is 0. The events relevant 
to the expected utility are the possible values of the secret; 
learning the combination does not change the probabilities 
of the numbers at all. This is true even if we put the possi- 
ble combinations of the lock into the sample space. On the 
other hand, it is clear that people may well be willing to pay 
for learning the first 20 digits. It converts an infeasible
problem (trying $2^{40}$ combinations by brute force) to a
feasible problem (trying $2^{20}$ combinations).

Although this example is clearly contrived, there are many 
far more realistic situations where people are clearly
willing to pay for information to improve computation. For
example, companies pay to learn about a manufacturing
process that will speed up production; people buy books on
speedreading; and faster algorithms for search clearly are 
considered valuable. We show that we can use our compu- 
tational framework to make the notion of value of computa- 
tional information precise, in a way that makes it a special 
case of value of information.² In addition, we define a
notion of computational value of conversation, where the DM
can communicate interactively with an informed observer
before making a decision (as opposed to just getting some
information). Interestingly, the notion of zero knowledge
[Goldwasser, Micali, and Rackoff 1989] gets an elegant
interpretation in this framework. Roughly speaking, a
zero-knowledge algorithm for membership in a language L is one
where there is no added value of conversation in running the 
algorithm beyond what there would be in learning whether 
an input x is in L, no matter what random variable is of in- 
terest to the DM. 

In the next section we define our computational frame- 
work carefully, and show how it delivers reasonable results 
in a number of examples. In Section 3, we consider the value
of computational information. We conclude with a discussion
of related work in Section 4.

2 A computational framework 

The framework we use here for adding computation to deci- 
sion theory is essentially a single-agent version of what were 
called in [Halpern and Pass 2010] Bayesian machine games.
In a standard Bayesian game, each player has a type in some 
set T, and then makes a single move. Player i's type can be 
viewed as describing i's initial information; some facts that 
i knows about the world. In the number-in-the-safe exam- 
ple, there is essentially only one type, since the DM gets no 
information. In the case of the manufacturing process, the 
type could be the configuration of the system; manufactur- 
ing processes typically apply to a number of configurations. 
We assume that an agent's move consists of choosing a Tur- 
ing machine. As we said in the introduction, associated with 
each Turing machine and type is its complexity. Given as in- 
put a type, the Turing machine outputs an action. The utility 
of a player depends on the type profile (i.e., the types of all 
the players), the action profile, and the complexity profile. 
(While typically all that matters to player i is the complexity 
of his algorithm, it may, for example, matter to him that his 
algorithm is faster than that of player j.) 

Turning to decision theory, we take a standard decision
problem with types to be characterized by a tuple
$(S, T, A, \Pr, u)$, where $S$ is a state space, $T$ is a set
of types, $A$ is a set of actions, $\Pr$ is a probability
distribution on $S \times T$ (there may be correlation
between states and types), and
$u : S \times T \times A \to \mathbb{R}$, where $u(s,t,a)$ is
the DM's utility if he performs action $a$ in state $s$ and
has type $t$.³ (It is not typical to consider a decision
maker's type in standard decision theory, but it does not
hurt to add it; it will prove useful once we consider
computation.) For each action $a$, we can consider the random
variable $u_a$ defined on $S \times T$ by taking
$u_a(s,t) = u(s,t,a)$. The expected utility of action $a$,
denoted $E_{\Pr}[u_a]$, is just the expected value of the
random variable $u_a$ with respect to the probability
distribution $\Pr$; that is,
$E_{\Pr}[u_a] = \sum_{(s,t) \in S \times T} \Pr(s,t)\, u(s,t,a)$.
We assume that the DM is an expected utility maximizer, so he
chooses an action $a$ with the largest expected utility.

2 Our notion of value of computational information is related
to, but not quite the same as, the notion of value of
computation introduced by Horvitz [1987, 2001]; see Section 4.

3 In [Halpern and Pass 2010], we did not have a state space
$S$, but we assumed that nature had a type. Nature's type can
be identified with the state.

To combine the ideas of Bayesian machine games and de- 
cision problems, we consider computational decision prob- 
lems. In a computational decision problem, just like in a 
computational Bayesian machine game, the DM chooses a 
Turing machine. We assume that the action performed by 
the TM depends on the type. We denote by $M(t)$ the output
of the machine on input the type $t$. To capture the DM's
uncertainty about the TM's output, we use an output function
$O : \mathbf{M} \times S \times T \to \mathbb{N}$, where
$\mathbf{M}$ denotes the set of Turing machines; $O(M,s,t)$
is used to describe what the DM thinks the output of $M(t)$
is in state $s$. To simplify the presentation, we abuse
notation and use $M(s,t)$ to denote $O(M,s,t)$.

The DM's utility will depend on the state s, his type t, 
and the action M(s,t), as is standard; in addition, it will 
depend on the "complexity" of M given input t. The com- 
plexity of a machine can represent, for example, the run- 
ning time or space usage of M, or the complexity of M 
itself, or some combination of these factors. For example, 
Rubinstein [1986] considers what can be viewed as a special
case of our model, where the DM chooses a finite automaton
(and has no type); the complexity of $M$ is the number
of states in the description of the automaton. To capture the
cost of computation formally, we use a complexity function
$C : \mathbf{M} \times S \times T \to \mathbb{N}$ to describe
the complexity of a TM
given an input type and state. (As we shall see, by allowing 
the state to be included as an argument to C, we can capture 
the DM's uncertainty about the complexity.) 

We define a computational decision problem to be a tuple
$\mathcal{D} = (S, T, A, \Pr, \mathcal{M}, C, O, u)$, where
$S$, $T$, $A$, and $\Pr$ are as in the definition of a
standard decision problem, $\mathcal{M} \subseteq \mathbf{M}$
is a set of TMs (intuitively, the set that the DM can choose
among), $O$ is an output function, $C$ is a complexity
function, and
$u : S \times T \times A \times \mathbb{N} \to \mathbb{R}$.
The expected utility of a TM $M$ in the decision problem
$\mathcal{D}$ is

$$\sum_{s \in S,\, t \in T} \Pr(s,t)\, u(s, t, M(s,t), C(M,s,t)).$$

Note that now the utility function gets the complexity of $M$
as an argument. For ease of exposition here, we restrict to
deterministic TMs for most of the paper; we need to consider
randomized TMs for our results on zero knowledge.
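
As a rough illustration of this definition, the following Python
sketch computes the expected utility of a machine over a finite set of
states and types. The dictionary encoding of Pr, the callables
`output` and `complexity`, the machine names, and the toy numbers are
all assumptions made for illustration; they are not part of the formal
model.

```python
from fractions import Fraction

def expected_utility(pr, u, output, complexity, machine):
    """Sum Pr(s, t) * u(s, t, M(s, t), C(M, s, t)) over all (s, t).

    pr         -- dict mapping (state, type) pairs to probabilities
    u          -- utility function u(s, t, action, complexity)
    output     -- output function O(M, s, t): what the DM thinks M outputs
    complexity -- complexity function C(M, s, t)
    """
    return sum(
        p * u(s, t, output(machine, s, t), complexity(machine, s, t))
        for (s, t), p in pr.items()
    )

# Hypothetical toy instance: one type, and two states that encode the
# DM's uncertainty about whether the machine "tester" meets a deadline.
pr = {("fast", "t0"): Fraction(2, 3), ("slow", "t0"): Fraction(1, 3)}

def output(machine, s, t):
    return "pass" if machine == "always_pass" else "answer"

def complexity(machine, s, t):
    # Missing the deadline costs 10; everything else is free (assumed).
    return 10 if (machine == "tester" and s == "slow") else 0

def u(s, t, action, c):
    # Answering is assumed to be correct here and pays 10; passing pays 1.
    return (1 if action == "pass" else 10) - c

eu_tester = expected_utility(pr, u, output, complexity, "tester")
eu_pass = expected_utility(pr, u, output, complexity, "always_pass")
```

Under these assumed numbers the DM prefers "tester" even though he
ascribes probability 1/3 to its missing the deadline, because the
state-dependent complexity is already folded into the expectation.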

Example 2.1 Consider the primality-testing problem
discussed in the introduction. Formally, suppose that the DM's
type is just a natural number less than $2^{40}$, and the DM
must determine whether the type is prime. The DM can choose
either 0 (the number is not prime), 1 (the number is prime), or
2 (pass). If $M$ is a TM, then $M(s,t)$ is $M$'s output in
state $s$ on input $t$. The state $s$ here is used to capture
the DM's uncertainty about the output. So if the DM believes
that the TM will output pass with probability 2/3, then the
set of states such that $M(s,t) = 2$ has probability 2/3. Let
$C(M,s,t)$ be 0 if $M$ computes the answer within $2^{20}$
steps on input $t$, and 10 otherwise. (Think of $2^{20}$ steps
as representing a hard deadline.) Here the state $s$ encodes
the DM's uncertainty about the running time of $M$. For
example, if the DM does not know the running time of $M$, but
ascribes probability 2/3 to $M$ finishing in less than
$2^{20}$ steps on input $t$, then the set of states $s$ such
that $C(M,s,t) = 0$ has probability 2/3. Finally, let utility
$u(s,t,a,c) = 10 - c$ if $a$ is either 0 or 1, and this is the
correct answer in state $s$ (that is, $t$ is viewed as prime
in state $s$ and $a = 1$, or $t$ is not viewed as prime in
state $s$ and $a = 0$), and $u(s,t,2,c) = 1 - c$. Now the
state $s$ is used to encode the DM's uncertainty about the
correctness of $M$'s answer. (Note that we are allowing
"impossible" states, where $t$ is viewed as prime in state $s$
even though it is in fact composite; this is needed to model
the DM's uncertainty.) Thus, if the DM is sure that $M$ always
gives the correct output, then $u(s,t,a,c) = 10 - c$ for all
states $s$ and $a \in \{0, 1\}$.

We can also consider a variant of this problem, where
the DM is given a specific input $t$ and is asked if $t$ is
prime. Although there is obviously a right answer (the number
is prime or it's not), the DM might still have uncertainty
regarding whether a particular TM $M$ gives the right answer,
the running time of $M$, and the output of $M$. ∎

Example 2.2 Consider the number-in-the-safe example
from the introduction. Here there is only a single type,
$t_0$; we can think of the state space $S$ as consisting of
triples $(s_1, s_2, s_3)$, where $s_1$ is the number in the
safe, $s_2$ is the combination, and $s_3$ encodes the DM's
beliefs about the complexity and correctness of TMs. An
algorithm in this case is just a sequence of combinations
to try and a stopping rule. Suppose that the agent gets
utility $10 - C(M, (s_1, s_2, s_3), t_0)$ if $s_2$ (the actual
combination) is one of the numbers generated by $M$ before
it halts, and $-C(M, (s_1, s_2, s_3), t_0)$ otherwise, where
$C(M, (s_1, s_2, s_3), t_0)$ is 0 if $M$ halts within $2^{20}$
steps in state $(s_1, s_2, s_3)$, and 10 otherwise. ∎

Example 2.3 (Biases in information processing) 

Psychologists have observed many systematic biases 
in the way that individuals update their beliefs as new 
information is received (see [Rabin 1998] for a survey).
In particular, a "first-impressions-matter" bias has been 
observed: individuals put too much weight on initial signals 
and less weight on later signals. As they become more con- 
vinced that their beliefs are correct, many individuals even 
seem to simply ignore all information once they reach a con- 
fidence threshold. Several papers in behavioral economics 
have focused on identifying and modeling some of these 
biases (see, e.g., [Rabin 1998] and the references therein,
[Mullainathan 2002], and [Rabin and Schrag 1999]). In
particular, Mullainathan [2002] makes a potential connection
between memory and biased information processing, using 
a model that makes several explicit (psychology-based) 
assumptions on the memory process (e.g., that the agent's 
ability to recall a past event depends on how often he has 
recalled the event in the past). More recently, Wilson [2002]
has presented an elegant model of bounded rationality,
where agents are described by finite automata, which 
(among other things) can explain why agents eventually 
choose to ignore new information; her analysis, however, is 
very complex and holds only in the limit (specifically, in the 
limit as the probability v that a given round is the last round 
goes to 0). 

As we now show, the first-impression-matters bias can be 
easily explained if we assume that there is a small cost for 
"absorbing" new information. Consider the following sim- 
ple game (which is very similar to the one studied by Mul- 
lainathan [2002] and Wilson [2002]). The state of nature is 
a bit b that is 1 with probability 1/2. An agent receives as 
his type a sequence of independent samples $s_1, s_2, \ldots, s_n$,
where $s_i = b$ with probability $p > 1/2$. The samples
correspond to signals the agent receives about $b$. An agent is
supposed to output a guess $b'$ for the bit $b$. If the guess is
correct, he receives $1 - mc$ as utility, and $-mc$ otherwise,
where $m$ is the number of bits of the type he read, and $c$ is
the cost of reading a single bit ($c$ should be thought of as the
cost of absorbing/interpreting information). It seems
reasonable to assume that $c > 0$; signals usually require some
effort to decode (such as reading a newspaper article, or
attentively watching a movie). If $c > 0$, it easily follows by
the Chernoff bound that after reading a certain (fixed) number
of signals $s_1, \ldots, s_i$, the agents will have a sufficiently
good estimate of $p$ that the marginal cost of reading one
extra signal $s_{i+1}$ is higher than the expected gain of
finding out the value of $s_{i+1}$. That is, after processing a
certain number
of signals, agents will eventually disregard all future signals 
and base their output guess only on the initial sequence. We 
omit the straightforward details. 
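
The trade-off can be sketched numerically. The Python fragment below
is a simplification, not the argument alluded to above: it assumes the
agent commits in advance to reading a fixed number m of signals and
then guesses by majority vote, so his expected utility is the
probability of guessing correctly minus m times c. The function names
and the cap `max_m` are hypothetical.

```python
from math import comb

def accuracy(m, p):
    """Probability that a majority vote over m independent signals, each
    equal to b with probability p, recovers b (ties broken by a fair coin)."""
    total = 0.0
    for k in range(m + 1):  # k = number of signals that agree with b
        prob_k = comb(m, k) * p**k * (1 - p) ** (m - k)
        if 2 * k > m:
            total += prob_k
        elif 2 * k == m:
            total += 0.5 * prob_k
    return total

def best_sample_size(p, c, max_m=200):
    """Number of signals m in [0, max_m] maximizing accuracy(m, p) - m * c."""
    return max(range(max_m + 1), key=lambda m: accuracy(m, p) - m * c)

cheap = best_sample_size(0.7, 0.01)   # small reading cost
costly = best_sample_size(0.7, 0.5)   # reading costs half the prize
```

With a small positive cost the best sample size is positive but well
below the cap, and with a large cost the agent reads nothing at all;
in both cases every signal past the chosen prefix is ignored.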

Essentially the same approach allows us to capture belief 
polarization. Suppose for simplicity that two agents start out 
with slightly different beliefs regarding the value of some 
random variable X (think of X as representing something 
like "O.J. Simpson is guilty"), and get the same sequence 
$s_1, s_2, \ldots, s_n$ of evidence regarding the value of $X$.
(Thus, now the type consists of the initial belief, which can
for example be modeled as a probability or a sequence of
evidence received earlier, and the new sequence of evidence.)
Both agents update their beliefs by conditioning. As before,
there is a cost of processing a piece of evidence, so once a DM
gets sufficient evidence for either $X = 0$ or $X = 1$, he will
stop processing any further evidence. If the initial evidence
supports $X = 0$, but the later evidence supports $X = 1$ even
more strongly, the agent that was initially inclined towards
$X = 0$ may raise his beliefs to be above threshold, and thus
stop processing, believing that $X = 0$, while the agent
initially inclined towards $X = 1$ will continue processing and
eventually believe that $X = 1$. ∎
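
A small simulation makes the polarization scenario concrete. The
sketch below is illustrative only: it replaces the explicit processing
cost by a fixed confidence threshold of 0.9, and it assumes that each
piece of evidence matches X with probability 0.7. The priors, the
threshold, the noise level, and the evidence sequence are all
hypothetical choices.

```python
def process(prior_x1, evidence, p=0.7, threshold=0.9):
    """Condition on signals one at a time; stop once either hypothesis
    reaches the confidence threshold. Returns (belief, signals_read)."""
    odds = prior_x1 / (1 - prior_x1)  # odds in favor of X = 1
    read = 0
    for signal in evidence:
        prob_x1 = odds / (1 + odds)
        if prob_x1 >= threshold or 1 - prob_x1 >= threshold:
            break  # confident enough: ignore all remaining evidence
        # Bayesian update under the assumed noise model.
        odds *= p / (1 - p) if signal == 1 else (1 - p) / p
        read += 1
    prob_x1 = odds / (1 + odds)
    return (1 if prob_x1 >= 0.5 else 0), read

# Early evidence favors X = 0; later evidence favors X = 1 more strongly.
evidence = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
skeptic = process(0.4, evidence)   # initially leans towards X = 0
believer = process(0.6, evidence)  # initially leans towards X = 1
```

Both agents see the same sequence and both condition correctly on what
they read, yet the first stops after three signals believing X = 0
while the second reads on and ends up believing X = 1.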

Example 2.4 (Status quo bias) The status quo bias is well 
known. To take just one example, Samuelson and Zeckhauser
[1998] observed that when Harvard University professors
were offered the possibility of enrolling in some new
health-care options, older faculty, who were already enrolled
in a plan, enrolled in the new option much less often than
new faculty. Assuming that all faculty evaluate the plans in
essentially the same way, this can be viewed as an instance
of a status quo bias. Samuelson and Zeckhauser suggested a 
number of explanations for this phenomenon, one of which 
was computational. As they point out, the choice to un- 
dertake a careful analysis of the options is itself a decision. 
Someone who is already enrolled in a plan and is relatively 
happy with it can rationally decide that it is not worth the 
cost of analysis (and thus just stick with her current plan), 
while someone who is not yet enrolled is more likely to de- 
cide that the analysis is worthwhile. This explanation can 
be readily modeled in our framework. An agent's type can 
be taken to be a description of the alternatives. A TM de- 
cides how many alternatives to analyze. There is a cost to 
analyzing an alternative, and we require that the decision 
made be among the alternatives analyzed or the status quo. 
(We assume that the status quo has already been analyzed, 
through experience.) If the status quo already offers an ac- 
ceptable return, then a rational agent may well decide not to 
analyze any new alternatives. Interestingly, Samuelson and 
Zeckhauser found that, in some cases, the status quo bias is 
even more pronounced when there are more alternatives. We
can capture this phenomenon if we assume, for example,
that there is an initial cost to analyzing, and the initial
cost itself depends in part on how many alternatives there
are to analyze (so that it is more expensive to analyze only
three alternatives if there are five alternatives altogether than
if there are only three alternatives). This would be reasonable if
there is some setup cost in order to start the analysis, and the
setup depends on the number of items to be analyzed. ∎

3 Value of computational information 

3.1 Value of information: a review 

Before talking about value of computational information, we 
briefly review value of information. Consider a standard de- 
cision problem. To deal with value of information, we con- 
sider a partition of the state space S. The question is what it 
would be worth to the DM to find out which cell in the parti- 
tion the true state is in. (Think of the cells in the partition as 
corresponding to the possible realizations of a random vari- 
able X, and the value of information as corresponding to the 
value of learning the actual realization of X.) Of course, the 
value may depend on the DM's type t. To compute the value 
of information, we compute the expected expected utility of 
the best action given type t conditional on receiving the in- 
formation, and compare it to the expected utility of the best 
action for type t before finding out the information. We talk 
about "expected expected utility" here because we need to 
take into account how likely the DM is to discover that he is 
in a particular cell. 

Example 3.1 Suppose that an investor can buy either a
stock or a bond. There are two states of the world, $s_1$ and
$s_2$, and a single type $t_0$. A priori, the investor thinks
$s_1$ has probability 2/3 and $s_2$ has probability 1/3. Buying
the bond gives him a guaranteed utility of 1 (in both $s_1$ and
$s_2$). In state $s_1$, buying the stock gives a utility of 3;
in state $s_2$, buying the stock gives a utility of $-4$.
Clearly, a priori, buying the stock has an expected utility of
$(2/3)(3) + (1/3)(-4) = 2/3$, so buying the bond has a higher
expected utility. What is the value of learning the true state
(which corresponds to the partition $\{\{s_1\}, \{s_2\}\}$)?
Clearly if the true state is $s_1$, buying the stock is the
best action, and has (expected) utility 3; in state $s_2$,
buying the bond is the best action, and has expected utility 1.
Thus, the expected expected utility of the information is
$(2/3)(3) + (1/3)(1) = 7/3$ (since with probability 2/3 the DM
expects to learn that it is state $s_1$ and with probability
1/3 the DM expects to learn that it is $s_2$), and so the value
of information is $7/3 - 1 = 4/3$. ∎
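
The calculation in this example can be reproduced with a few lines of
Python. This is a minimal sketch for the single-type case; the
function name, the dictionary encoding of the prior, and the `cell`
argument (which maps a state to its partition cell) are assumptions
introduced for illustration.

```python
from fractions import Fraction

def value_of_information(pr, actions, u, cell):
    """Expected utility of acting after learning the cell, minus the
    expected utility of the best action chosen without that knowledge."""
    def best(weights):
        # Best weighted utility over actions for a dict state -> weight.
        return max(sum(w * u(s, a) for s, w in weights.items())
                   for a in actions)

    # Group states by the partition cell they fall in.
    cells = {}
    for s, p in pr.items():
        cells.setdefault(cell(s), {})[s] = p

    # Using unnormalized weights folds Pr(cell) into each cell's term.
    informed = sum(best(weights) for weights in cells.values())
    return informed - best(pr)

pr = {"s1": Fraction(2, 3), "s2": Fraction(1, 3)}
payoff = {("s1", "stock"): 3, ("s2", "stock"): -4,
          ("s1", "bond"): 1, ("s2", "bond"): 1}
voi = value_of_information(pr, ["stock", "bond"],
                           lambda s, a: payoff[(s, a)], cell=lambda s: s)
```

Passing a coarser `cell` function (for instance one that maps every
state to the same cell) gives a value of 0, since the DM then learns
nothing.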

We leave it to the reader to write the obvious formal defi- 
nition of value of information in type t. 

3.2 Value of computational information 

In our framework, it is easy to model the value of com- 
putational information: it is just a special case of value of 
information. Formally, given a standard decision problem
$(S, T, A, \Pr, u)$, we must first extend it to a
computational decision problem
$(S', T, A, \Pr, \mathcal{M}, C, O, u')$. $\mathcal{M}$
is some appropriate set of TMs; each TM in $\mathcal{M}$
outputs an action in $A$ given an element of $S' \times T$. As
discussed in Section 2, we need a richer state space to capture
the DM's uncertainty regarding the output of the TM and
the running time of the TM chosen. We can take $S'$ to
have the form $S \times S''$, where $s'' \in S''$ determines
the running time and output of each TM $M \in \mathcal{M}$.
Similarly, $u'((s,s''), t, M((s,s''),t), C(M,(s,s''),t))$
depends on $u(s, t, M((s,s''),t))$ and $C(M,(s,s''),t)$.
(For example, we can assume that
$u'((s,s''), t, M((s,s''),t), C(M,(s,s''),t)) = u(s, t, M((s,s''),t)) - C(M,(s,s''),t)$,
but we do not require this.)

In this setting, value of computational information
essentially becomes a special case of value of information. The
only difference is that since the machine set $\mathcal{M}$
might be infinite, there might not exist a machine with maximal
expected utility. So, instead of comparing the expected
utilities of the best machines (before and after receiving the
information), we compare the suprema of the expected utilities
of machines $M \in \mathcal{M}$ (before and after receiving the
information). More precisely, given a partition $Q$ of the
states of nature, for every cell $q \in Q$, let $\Pr_q$ denote
the distribution $\Pr$ conditioned on the event that the state
of nature is in the cell $q$, and let the random variable
$q(s,t)$ denote the cell of $s$. The value of computational
information (of learning what cell $q \in Q$ the state of
nature is in) is

$$\sum_{(s,t) \in S' \times T} \Pr(s,t) \sup_{M \in \mathcal{M}} E_{\Pr_{q(s,t)}}[u'_M] \;-\; \sup_{M \in \mathcal{M}} E_{\Pr}[u'_M]. \qquad (1)$$

That is, on the left-hand side, we compute the expected
expected utility by summing
$\Pr(s,t) \sup_{M \in \mathcal{M}} E_{\Pr_{q(s,t)}}[u'_M]$
over all pairs $(s,t) \in S' \times T$. Effectively, this means
that the DM chooses the best TM for each cell, after being
informed what the cell is. We discuss this issue in more detail
in Section 3.3.

Using this formalism, we can consider the value of learn- 
ing that a particular TM M is a "good" algorithm for the 
problem at hand (i.e., either learning that it always gives the 
correct answer, or always runs quickly), since this is just an 
event, just like learning the value of some random variable
$X$ is an event in a standard decision problem. In a
computational decision problem, the DM has a prior probability on
M being good, and can compute the expected increase in 
utility resulting from learning that M is good. 

Example 3.2 Consider the primality-testing problem from
Example 2.1, viewed as a computational decision problem
$(S, T, A, \Pr, \mathcal{M}, C, O, u')$. Given the utility
function, for simplicity, we restrict $\mathcal{M}$ to be a
finite set of TMs that all halt within $2^{20}$ steps. Thus,
the DM is certain of the complexity of all TMs in
$\mathcal{M}$, and it is 0. On the other hand, the DM can still
be uncertain about the output of a TM, and of the "goodness" of
the output. For example, if $M$ is a TM that halts after one
step and outputs 0, the DM may be certain that $M$'s output is
0, but be uncertain as to the "goodness" of its output. Of
course, such an algorithm might still be worth using: if the
agent places a high prior probability on the input not being
prime (which would be the case if the input was chosen
uniformly at random among all numbers less than $2^{40}$), then
the expected utility of answering 0 for all inputs is quite
high. A yet better algorithm would be to use some naive test
for primality, run it for $2^{20}$ steps, and return 0 unless
the algorithm says that the number is prime. The DM can then
ask what the value is of learning whether a specific TM $M$ is
good (i.e., returns the correct answer for all inputs). This
depends on the DM's prior probability that $M$ is good; but if
it is low, then the value of information is also low. Finally,
we can ask the value of being told a good algorithm (assume
that the DM is certain that there is a good algorithm, which
always returns the right answer in less than $2^{20}$ steps,
but doesn't know which it is). This amounts to learning the
value of a random variable $X$ whose range is a subset of
$\mathcal{M}$, where $X = M$ only if $M$ is a good TM. Clearly,
after learning this information, the DM's expected expected
utility will be 10 (no matter what he learns, his expected
utility will be 10). The value of this information depends on
the expected utility of the DM's best current algorithm. Note
that if the DM believes that the input is chosen uniformly at
random, then the expected utility of even the simple algorithm
that returns 0 no matter what is close to 10. On the other
hand, if the DM believes that the input is chosen so that
primes and non-primes are equally likely, the best algorithm is
unlikely to have expected utility much higher than 1 (the best
strategy is likely to involve testing whether the number is
prime, outputting the answer if the tests reveal whether the
number is prime within $2^{20}$ steps, and outputting 2
otherwise). In this case, the value of this information would
be close to 9. ∎

Example 3.3 Consider the number-in-the-safe example,
viewed as a computational decision problem
$\mathcal{D} = (S, T, A, \Pr, \mathcal{M}, C, O, u')$. Recall that the
state space S has the form $(s_1, s_2, s_3)$, where $s_1$ is the
number in the safe, $s_2$ is the combination of the safe, and $s_3$
models the DM's uncertainty regarding the output of TMs and
their running time. There is only a single type, so we can take
$T = \{t_0\}$. We have the obvious uniform probability on the
first two components of S. Again, we restrict $\mathcal{M}$ to
algorithms that halt within $2^{20}$ steps. If it takes one time
unit to test a particular combination, and the DM believes that
the best approach is to generate some sequence of $2^{20}$
combinations and test them, then it is clear that the DM believes
that the expected utility of this approach is $2^{-20}(1{,}000{,}000)$.
Learning the first 20 digits makes the problem feasible, and
thus results in an expected expected utility of 1,000,000 (no
matter which 20 digits are the right ones, the expected utility
is 1,000,000), and so has a high value of information.

3.3 Value of conversation 

Recall that, for value of information, we consider how much
it is worth for a DM to find out which cell (in some partition
of the state space S) the true state s is in. In other words, we
consider the question of how much it is worth for the DM
to learn the value f(s) of some function f on input the
true state s. A more general setting considers how much it
is worth for a DM to interact with another TM I (for informant)
that is running on input the true state s.

Example 3.4 Suppose a number between 1 and 100 is cho- 
sen uniformly at random. If the DM guesses the number cor- 
rectly, he receives a utility of 100; otherwise, he receives a 
utility of 0. Without any further information, the DM clearly 
cannot get more than 1 in expected utility. But if he can 
sequentially ask 7 yes/no questions, he can learn the num- 
ber by using binary search (i.e., first asking if the number is 
greater than 50; if so, asking if it is greater than 75; etc.), 
getting a utility of 100. Thus, the value of a conversation 
with a machine that answers 7 yes/no questions is 99.
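
The binary-search strategy in this example can be sketched as follows. The function name `guess_with_questions` and the way the informant's answer is modeled (a direct comparison with the secret) are assumptions made for illustration; computation cost is not modeled here.

```python
from fractions import Fraction


def guess_with_questions(secret: int, lo: int = 1, hi: int = 100):
    """Locate `secret` in [lo, hi] by asking "is it greater than mid?" questions.

    Returns the number found and how many yes/no questions were asked.
    """
    questions = 0
    while lo < hi:
        mid = (lo + hi) // 2
        questions += 1
        if secret > mid:   # the informant answers "yes"
            lo = mid + 1
        else:              # the informant answers "no"
            hi = mid
    return lo, questions


# Worst-case number of questions over all 100 possible secrets.
worst_case = max(guess_with_questions(n)[1] for n in range(1, 101))

eu_without = 100 * Fraction(1, 100)   # blind guess: correct with probability 1/100
eu_with = Fraction(100)               # the number is always identified
value_of_conversation = eu_with - eu_without
```

The first question asked is whether the number is greater than 50 and, if so, the second is whether it is greater than 75, matching the description in the example.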

The value of conversation with (a TM) I for a standard
decision problem can be formalized in exactly the same way
as value of information. Formalizing computational value
of conversation requires extending the notion of computational
decision problems to allow the DM to choose among
interactive Turing machines M (this was already done in
[Halpern and Pass 2010]). We omit the formal definition
of an interactive Turing machine (see, for example,
[Goldreich 2001]); roughly speaking, the machines use a
special tape where the message to be sent is placed and another
tape where a message to be received is written. We
assume that the DM chooses a TM M; M then proceeds in
two phases. First there is a communication phase, where M
converses with the informant I; then, after the communication
phase is over, M chooses an action for the underlying
decision problem. Note that what an interactive TM does 
(that is, the message it sends or the action it takes after the 
communication phase is over) can depend on its input, the 
history of messages received, and the random coins it tosses 
(if it randomizes). 

When considering an interactive TM M, we assume that
the complexity function C depends not only on the machine
M and its type t, but also on the messages that the DM
receives, and its random coin tosses. More precisely, we
define the view of an interactive machine M to be a string
t; h; r in $\{0,1\}^*; \{0,1\}^*; \{0,1\}^*$, where t is the part of the
type actually read by M, r is a finite bitstring representing
the string of random bits actually used, and h is a finite
sequence of messages received and read. If v = t; h; r, we
take M(v) to be the output of M given the view. (Note
that M(v) is either a message or an action in the underlying
decision problem, if the conversation phase is over.) We
now consider output functions
$O : \mathcal{M} \times S \times \{0,1\}^* \to \mathbb{N}$,
where $\mathcal{M}$ denotes a set of (interactive) Turing machines, and
let O(M, s, v) describe what the DM thinks the output of
the machine M is, given the view v, if the state of nature
is s. Analogously, we now consider complexity functions
$C : \mathcal{M} \times S \times \{0,1\}^* \to \mathbb{N}$, and let
C(M, s, v) describe the complexity of the machine M given
the view v if the state of nature is s.

When running with M, I gets as input the actual state
s (we want to allow for the possibility that I has access to
some features of the world that M does not). That means
that the state s is playing a double role here; it is used both
to capture the fact that M is interacting (in part) with nature,
and may get some feedback from nature, and to model the
DM's uncertainty about the world. To formalize the computational
value of conversation with I, let the random variable
$\mathrm{view}_{I,M}(s, t, r_I, r_M)$ denote the view of the DM in state s
at the end of the communication phase when communicating
with I (running on input s with random tape $r_I$) if the DM
uses the machine M (running on input t with random tape
$r_M$). We assume that $\mathrm{view}_{I,M}(s, t, r_I, r_M)$ is generated by
computing the messages sent by M and I at each step using
O; that is, M's first message is $O(M, s, v_0)$, where $v_0$ is
M's initial view $t; \langle\rangle; r'_M$, where $\langle\rangle$ denotes the empty
history, and $r'_M$ is a prefix of $r_M$, M's sequence of random bits
(however much randomness M used to determine its first
message); similarly, I's first message is $O(I, s, v_1)$, where
$v_1 = s; \langle m_0 \rangle; r'_I$, $r'_I$ is a finite prefix of $r_I$, and $m_0$ is the
first message sent by M; and so on. This means that M's
beliefs about the sequence of messages sent are determined
by his beliefs about the individual messages sent in all
circumstances.⁴

Let $\Pr^+$ denote the distribution on
$S \times T \times (\{0,1\}^\infty)^2$ that is the product of Pr and the
uniform distribution on pairs of random strings. For each pair
(I, M) of interactive TMs, we consider the random variable
$u'_{I,M}$ defined on $S \times T \times (\{0,1\}^\infty)^2$ by taking
$u'_{I,M}(s, t, r_I, r_M) = u'(s, t, O(M, s, v), C(M, s, v))$,
where $v = \mathrm{view}_{I,M}(s, t, r_I, r_M)$. That is,
$u'_{I,M}(s, t, r_I, r_M)$ describes the utility of the actions
that result when M converses with I in state s given input
t and random tape $r_I$ for I and $r_M$ for M, taking the
complexity of the interaction into account. The expected
utility of M when communicating with I is $E_{\Pr^+}[u'_{I,M}]$.

The computational value of conversation with I is now
defined as

\[
\sup_{M \in \mathcal{M}} E_{\Pr^+}[u'_{I,M}] - \sup_{M \in \mathcal{M}} E_{\Pr^+}[u'_{\bot,M}], \qquad (2)
\]

where $\bot$ is the "silent" machine that sends no messages.
That is, we compare the expected utility of the best machine
communicating with I and the expected utility of the best
machine that runs in isolation (i.e., is communicating with
$\bot$).

4 We can allow for M's beliefs about the sequence of messages 
sent to be independent of his beliefs about individual messages, at 
the price of complicating the framework. 



There is a subtlety in this definition that is worth emphasizing.
In general, when determining the best
choice of TM, we must ask whether it is reasonable to assume
that the TM knows its input. That is, is the choice
of TM being made before the DM knows the input, or after?
For example, in the primality-testing problem of Example 2.1,
does the DM choose a TM before knowing what the
number is or after? The answer to this question has no impact
if we do not take complexity into account, but it has a
major impact if we do consider complexity. Clearly, if we
know what the input n is, we can choose a TM that is likely
to give the right answer for n. There is clearly a very efficient
TM that gives the right answer for a specific input n; it
is the constant-time TM that just says "yes" if n is prime, or 
the constant-time TM that just says "no" if n is not prime. 
Of course, if there is uncertainty as to the quality of the TM, 
the DM may be uncertain as to what utility he gets with each 
choice. But the complexity is guaranteed to be low. On the 
other hand, if the choice of TM must be made before the TM 
knows the input, even if the DM understands the quality of 
the TM chosen, there may be no efficient TM that does well 
for all possible inputs. 

Whether it is appropriate to assume that the TM is chosen
before or after the DM knows the input depends on the
application. For the most part, in [Halpern and Pass 2010],
we implicitly assumed that the choice was made before the
DM knew the input; this seemed reasonable for the applications
of that paper. Here, in the definition of value of computational
information, we implicitly assumed that the DM
chose the best TM after learning the cell q (but before learning
the input t). We could also have computed the value
of computational information under the assumption that the
TM had to be chosen before discovering q. This would have
amounted to putting the sup outside the scope of the $E_{\Pr}$ in
Equation (1); this would have given



\[
\sup_{M \in \mathcal{M}} E_{\Pr}\bigl[E_{\Pr_q}[u'_M]\bigr] - \sup_{M \in \mathcal{M}} E_{\Pr}[u'_M]. \qquad (3)
\]



Here we are implicitly assuming that the TM M chosen
takes the cell q(s, t) as an input; moreover, the TM "understands"
that the "right" thing to do with q(s, t) is to condition
(and thus, to compute the expectation using $\Pr_q$).
Again, it is possible to allow more generality: the TM does
not have to condition; the definition of computational value
of conversation implicitly allows this. While (3) is a perfectly
sensible definition, it seems less appropriate when
considering value of information, where a DM might be
willing and able to devote a great deal of computation to a
problem after getting information (although there may well
be cases where (3) is indeed more appropriate than (1)).

By way of contrast, in (2) we are implicitly assuming that
the DM must choose the interactive TM before learning the
conversation; he does not get to choose a different one for
each conversation. We are evaluating the value of conversation
with I, rather than the value of a particular conversation
with I. This is why we do not consider the expected
expected utility of the best algorithm after receiving the
information, but rather consider the expected utility of
"communicating, interpreting, and finally acting". Intuitively, we
are assuming that a DM must choose a TM to interpret and
make use of the information gleaned from the conversation;
we want to take the cost of doing this interpretation into
account, by choosing a TM that is able to interpret all possible
computations.

We could in principle define a notion of value of particular
conversations with I, rather than the value of conversing
with I, by assuming that the DM chooses one TM that
decides how to converse with I, and then, after the conversation,
chooses the best TM to take advantage of that
particular conversation. Thus, at the second step, the TM
chosen would depend on the conversation. Formally, this
amounts to having another sup inside the scope of $E_{\Pr^+}$, but
this seems less appropriate here.

If we do not take the cost of computation into account,
whether we learn the conversation before or after making
the choice of TM is irrelevant. Indeed, the value of conversation
can be viewed as a special case of value of information:
for each "conversation-strategy" σ for the DM, simply
consider the value of receiving a transcript of the conversation
between I(s) and σ(t) (where t is the type of the DM).
The value of conversation with I is then simply the maximum
value of information over all conversation strategies σ.
By way of contrast, we cannot reduce computational value
of conversation to value of information. If there is a computational
cost associated with computing the messages to
send to I, the value of a conversation is no longer just the
maximum value of information.

Example 3.5 Consider the guess-the-number decision
problem from Example 3.4 again. What is the value of
a conversation with an informant I that picks two large
primes p and q, and sends the product N = pq to the
DM? If the DM manages to factor N, I sends the DM the
number chosen; otherwise I simply aborts. Clearly, the
value of information in the "best" conversation is 99 (the
DM learns the number and gets a utility of 100). However,
to implement this conversation requires the DM to factor
a large number. If computation is costly and factoring is hard
(as is widely believed), it might not be worth it for the
DM to attempt to factor the number. Thus, the value of
conversation with I would be 0 (or close to 0).
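
A toy calculation makes the gap between the two notions concrete. The sketch below assumes, purely for illustration, that the DM's utility is the reward minus an additive cost of factoring and that an attempted factorization always succeeds; neither assumption comes from the example itself, and the function name is hypothetical.

```python
def computational_value_of_conversation(factoring_cost: float) -> float:
    """Value of talking to the factoring informant I under an additive-cost model.

    ASSUMPTIONS (not from the paper): utility = reward - factoring_cost,
    and the DM's factoring attempt always succeeds.
    """
    eu_isolated = 1.0                     # blind guess among 100 numbers
    eu_ignore = 1.0                       # talk to I but do not bother factoring
    eu_factor = 100.0 - factoring_cost    # factor N, learn the number, pay the cost
    best_with_informant = max(eu_ignore, eu_factor)
    return best_with_informant - eu_isolated


free_computation = computational_value_of_conversation(0.0)     # 99.0
hard_factoring = computational_value_of_conversation(1e6)       # 0.0
```

With free computation the value coincides with the value of information, 99; once the cost of factoring reaches 99 or more, the best machine simply ignores I and the value drops to 0.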

3.4 Value of conversation and zero knowledge 

The notion of a zero-knowledge proof
[Goldwasser, Micali, and Rackoff 1989] is one of the
central notions in cryptography. Intuitively, a zero-knowledge
proof allows an agent (called the prover) to
convince another agent (called the verifier) of the validity
of some statement x, without revealing any additional
information. For instance, using a zero-knowledge proof,
a prover can convince a verifier that a number N is the
product of 2 primes, without actually revealing the primes.
The zero-knowledge requirement is formalized using the
so-called simulation paradigm. Roughly speaking, a proof
(P, V) (consisting of a strategy P for the prover, and
a strategy V for the verifier) is said to be perfect zero
knowledge if, for every verifier strategy V, there exists
a simulator S that can reconstruct the verifier's view of
the interaction with the prover with only a polynomial
overhead in runtime.⁵ Note that the simulator is running in
isolation and, in particular, is not allowed to interact with
the prover. Thus, intuitively, in a zero-knowledge proof,
the verifier receives only messages from the prover that
it could have efficiently generated on its own by running
the simulator S. The notion of precise zero-knowledge
[Micali and Pass 2006] aims at more precisely quantifying
the knowledge gained by the verifier. Intuitively, a zero-knowledge
proof of a statement x has precision p if any
view that the verifier receives in time t after talking to the
prover can be reconstructed by the simulator (i.e., without
the help of the prover) in time $p(|x|, t)$. (There is nothing
special about time here; we can also consider precision with
respect to more general complexity measures.)

As we now show, there is a tight connection between the 
value of conversation for computational decision problems 
and zero knowledge. To explain the ideas, we first need to 
introduce a new notion, which should be of independent in- 
terest: value of computational speedup. 

Computers get faster and faster. How much is it worth for
a DM to get a faster computer? To formalize this, we say that
a complexity function C' is at most a p-speedup of the
complexity function C if, for all machines M, types t, and states
s, $C'(M, s, t) \le C(M, s, t) \le p(C'(M, s, t))$. Intuitively, if
p is a constant, the value of a p-computational speedup for
a DM measures how much it is worth for the DM to change
to a machine that runs p times faster than his current machine.
More precisely, the value of a p-speedup in a computational
decision problem
$\mathcal{D} = (S, T, A, \Pr, \mathcal{M}, C, O, u')$ is
the difference between the maximum expected utility of the
DM in $\mathcal{D}$ and the maximum expected utility in any decision
problem $\mathcal{D}'$ that is identical to $\mathcal{D}$ except that the complexity
function in $\mathcal{D}'$ is C', where C' is at most a p-speedup of C.
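
As a rough illustration of this definition, the following sketch computes the value of a constant-factor speedup for a finite set of machines. It assumes, hypothetically, that each machine is summarized by a pair (expected benefit, number of steps), that utility is the benefit minus a linear per-step cost, and that the sped-up complexity is exactly C/p, which is the largest speedup the definition allows for constant p. None of these modeling choices is taken from the paper.

```python
def best_expected_utility(machines, cost_per_step, speedup=1):
    """Maximum over machines of benefit - cost_per_step * (steps / speedup).

    ASSUMPTION: utility is linear in complexity; `machines` is a list of
    (benefit, steps) pairs chosen for illustration.
    """
    return max(benefit - cost_per_step * (steps / speedup)
               for benefit, steps in machines)


def value_of_speedup(machines, cost_per_step, p):
    """Gain in the best achievable expected utility when every machine runs p times faster."""
    return (best_expected_utility(machines, cost_per_step, p)
            - best_expected_utility(machines, cost_per_step, 1))


# A "play safe" machine (benefit 1, no steps) and a thorough one (benefit 10, 1000 steps).
machines = [(1, 0), (10, 1000)]
gain = value_of_speedup(machines, cost_per_step=0.01, p=2)
```

In this toy instance the thorough machine is not worth running at the original speed (its net utility is 0, below the safe machine's 1), but it becomes the best choice once the computer is twice as fast, which is where the value of the speedup comes from.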

We now sketch the connection between zero-knowledge
and value of conversation. Given a language L, an objective
complexity function $C : \mathcal{M} \times T \to \mathbb{N}$ (one that does not
depend on the state of nature), and length parameter n, let
$\mathcal{D}^{C}_{L,n}$ denote the class of computational decision problems
$\mathcal{D} = (S, T, A, \Pr, \mathcal{M}, C', O, u)$, where $\mathcal{M}$ is the set of
interactive Turing machines, $S \subseteq \{0,1\}^n$, types in T have the
form x; t', where $x \in S$ and $t' \in \{0,1\}^*$, and Pr is such that
$\Pr(s, t) > 0$ only if s = x, t = x; t', and $x \in L$ (so that the
DM knows x and that $x \in L$). We also require that (1) the
DM does not have any uncertainty about the output and the
complexity functions: for all M, s, t, O(M, s, t) = M(t)
(so the DM knows the correct outputs of all machines) and
C'(M, s, t) = C(M, t) (so the DM knows the complexities
of all machines); and (2) $\mathcal{D}$ is monotone in complexity: for
all types $t \in T$, actions $a \in A$, and complexities $c \le c'$,
$u(t, a, c) \ge u(t, a, c')$; that is, the DM never prefers to
compute more. In the full paper we prove the following.

Theorem 3.6 If (P, V) is a zero-knowledge proof system for
the language L with precision $p(\cdot, \cdot)$ with respect to the
complexity function C, then for all $n \in \mathbb{N}$ and all computational
decision problems $\mathcal{D} \in \mathcal{D}^{C}_{L,n}$, the value of conversation with
P in $\mathcal{D}$ is no higher than the value of a $p(n, \cdot)$-computational
speedup in $\mathcal{D}$.

5 Technically, what is reconstructed is a distribution over views,
since both the prover and the verifier may randomize.

Thus, intuitively, if the DM is not uncertain about the com- 
plexities and the outputs of machines, the value of partic- 
ipating in a zero-knowledge proof is never higher than the 
value of (appropriately) upgrading computers. 

4 Discussion and Related Work 

We have introduced a formal framework for decision mak- 
ing that explicitly takes into account the cost of computation. 
Doing so requires taking into account the uncertainty that a 
DM may have about the running time of an algorithm, and 
the quality of its output. The framework allows us to provide 
formal decision-theoretic solutions to well-known observa- 
tions such as the status-quo bias and belief polarization. 

Of course, we are far from the first to recognize that deci- 
sion making requires computation — computation for knowl- 
edge acquisition and for inference. Nor are we the first to 
suggest that the costs for such computation should be explicitly
reflected in the utility function. Horvitz [1987] credits
Good [1952] for being the first to explicitly integrate the
costs of computation into a framework of normative rationality.
For example, Good points out that "less good methods
may therefore sometimes be preferred" (for computational
reasons). In a sequence of papers (see, for example,
[Horvitz 1987; Horvitz 2001] and the references therein),
Horvitz continues this theme, investigating various policies 
that trade off deliberation and action, taking into account 
computation costs. The framework presented here could be 
used to provide formal underpinnings to Horvitz's work. 

In terms of next steps, we have considered only one-shot 
decision problems here. It would be very interesting to ex- 
tend this framework to sequential decision problems. More- 
over, we have assumed that agents can compute the proba- 
bility of (or, at least, are willing to assign a probability to) 
events like "TM M will halt in 10,000 steps" or "the out- 
put of TM M solves the problem I am interested in on this 
input". Of course, calculating such probabilities itself in- 
volves computation. Similarly, calculating utilities may in- 
volve computation; although the utility was easy to compute 
in the simple examples we gave, this is certainly not the case 
in general. It would be relatively straightforward to extend 
our framework so that the TMs computed probabilities and 
utilities, as well as actions. In this setting, it may make 
sense to allow for a more general representation of uncer- 
tainty. That is, an agent may start with a set of probabilities 
rather than a single probability, and may then refine that set 
(perhaps to a single probability) over time. Similarly, an 
agent may start with a set of possible utilities, rather than a 
single utility. 

Once we allow sets of probabilities and utilities, we need 
to reconsider how to define the notion of "optimal choice". 
We could, for example, use the maxmin expected utility ap- 
proach of Gilboa and Schmeidler [1989], associating with 
each action the worst-case expected utility (over all proba- 
bility distributions and utility functions considered possible) 
and choose the action with the best worst-case expected utility;
other approaches may also be reasonable. However,
once we do this, we need to think about what counts as an
"optimal" decision if the DM does not have a probability and
utility, or has a probability only on a coarse space. An alternative
approach might be to allow the set of TMs that the
DM considers possible to increase (at some computational
cost), but assume that the DM has all the relevant probabilistic
information about the TMs that it can choose among.

Considering sequential decision-making also allows us to 
examine consistency of decisions. Taking cost of computa- 
tion into account may make decisions appear consistent that 
are not consistent without taking cost of computation into 
account. As this discussion should make clear, there is much 
fascinating research to be done in this area. 